{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ungraded Lab: Decision Boundary\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Goals\n",
    "In this lab you will:\n",
    "- plot the decision boundary for a logistic regression model, which will give you a better sense of what the model is predicting.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dataset\n",
    "\n",
    "Let's suppose you have following training dataset\n",
    "- The input variable `X` is a numpy array which has 6 training examples, each with two features\n",
    "- The output variable `y` is also a numpy array with 6 examples, and `y` is either `0` or `1`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "X = np.array([[0.5, 1.5], [1,1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])\n",
    "y = np.array([0, 0, 0, 1, 1, 1]).reshape(-1,1) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plot data \n",
    "\n",
    "Let's use a helper function to plot this data. The data points with label $y=1$ are shown as red crosses, while the data points with label $y=0$ are shown as black circles. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from lab_utils import plot_data\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "plot_data(X, y)\n",
    "\n",
    "# Set both axes to be from 0-6\n",
    "plt.axis([0, 6, 0, 6])\n",
    "# Set the y-axis label\n",
    "plt.ylabel('$x_1$')\n",
    "# Set the x-axis label\n",
    "plt.xlabel('$x_0$')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Logistic regression model\n",
    "\n",
    "\n",
    "* Suppose you'd like to train a logistic regression model on this data which has the form   \n",
    "\n",
    "  $f(x) = g(w_0x_0+w_1x_1 + b)$\n",
    "  \n",
    "  where $g(z) = \\frac{1}{1+e^{-z}}$, which is the sigmoid function\n",
    "\n",
    "\n",
    "* Let's say that you trained the model and get the parameters as $b = -3, w_0 = 1, w_1 = 1$. That is,\n",
    "\n",
    "  $f(x) = g(x_0+x_1-3)$\n",
    "\n",
    "  (You'll learn how to fit these parameters to the data further in the course)\n",
    "  \n",
    "  \n",
    "Let's try to understand what this trained model is predicting by plotting it's decision boundary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Refresher on logistic regression and decision boundary\n",
    "\n",
    "* Recall that for logistic regression, the model is represented as \n",
    "\n",
    "  $f_{\\mathbf{w},b}(x) = g(x^Tw)$\n",
    "\n",
    "  where\n",
    "\n",
    "  $g(z) = \\frac{1}{1+e^{-z}}$\n",
    "  \n",
    "  $g(z)$ is known as the sigmoid function and it maps all input values to values between 0 and 1.\n",
    "  \n",
    "\n",
    "* We interpret the output of the model ($f_{\\mathbf{w},b}(x)$) as the probability that $y=1$ given $x$ and parameterized by $b$ and $w$.\n",
    "* Therefore, to get a final prediction ($y=0$ or $y=1$) from the logistic regression model, we can use the following heuristic -\n",
    "\n",
    "  if $f_{\\mathbf{w},b}(x) >= 0.5$, predict $y=1$\n",
    "  \n",
    "  if $f_{\\mathbf{w},b}(x) < 0.5$, predict $y=0$\n",
    "  \n",
    "  \n",
    "* Since, $f_{\\mathbf{w},b}(x) = g(x^Tw)$, let's plot the sigmoid function to see where $g(z) >= 0.5$\n",
    "\n",
    "We've implemented the `sigmoid` function for you already and you can simply import and use it, as shown in the code block below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from lab_utils import sigmoid\n",
    "\n",
    "\n",
    "# Plot sigmoid(z) over a range of values from -20 to 20\n",
    "z = np.arange(-20,20)\n",
    "plt.plot(z, sigmoid(z), c=\"b\")\n",
    "\n",
    "# Plot and annotate the point where sigmoid(z) = 0.5\n",
    "plt.plot(0, sigmoid(0), 'ro')\n",
    "plt.text(0, sigmoid(0), '(x={}, y={})'.format(0, sigmoid(0)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* As you can see, $g(z) >= 0.5$ for $z >=0$\n",
    "\n",
    "* For a logistic regression model, $z = x^Tw$. Therefore,\n",
    "\n",
    "  if $x^Tw >= 0$, the model predicts $y=1$\n",
    "  \n",
    "  if $x^Tw < 0$, the model predicts $y=0$\n",
    "  \n",
    "  \n",
    "  \n",
    "### Plotting decision boundary\n",
    "\n",
    "Now, let's go back to our example to understand how the logistic regression model is making predictions.\n",
    "\n",
    "* Our logistic regression model has the form\n",
    "\n",
    "  $f(x) = g(-3 + x_0+x_1)$\n",
    "\n",
    "\n",
    "* From what you've learnt above, you can see that this model predicts $y=1$ if $-3 + x_0+x_1 >= 0$\n",
    "\n",
    "Let's see what this looks like graphically. We'll start by plotting $-3 + x_0+x_1 = 0$, which is equivalent to $x_1 = 3 - x_0$.\n",
    "\n",
    "**Exercise**\n",
    "\n",
    "Complete the code block below by writing `x1 = 3-x0`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Choose values between 0 and 6\n",
    "x0 = np.arange(0,6)\n",
    "\n",
    "### START CODE HERE ### \n",
    "### BEGIN SOLUTION ###  \n",
    "x1 = 3 - x0\n",
    "### END CODE HERE ###\n",
    "### END SOLUTION ###  \n",
    "# Plot the decision boundary\n",
    "plt.plot(x0,x1, c=\"b\")\n",
    "plt.axis([0, 6, 0, 6])\n",
    "\n",
    "# Fill the region below the line\n",
    "plt.fill_between(x0,x1, alpha=0.2)\n",
    "\n",
    "# Plot the original data\n",
    "plot_data(X,y)\n",
    "# Set the y-axis label\n",
    "plt.ylabel('x1')\n",
    "# Set the x-axis label\n",
    "plt.xlabel('x0')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* In the plot above, the blue line should represent the line $x_0 + x_1 - 3 = 0$ and it should intersect the x1 axis at 3 (if we set $x_1$ = 3, $x_0$ = 0) and the x0 axis at 3 (if we set $x_1$ = 0, $x_0$ = 3). \n",
    "\n",
    "\n",
    "* The shaded region represents $-3 + x_0+x_1 < 0$. The region above the line is $-3 + x_0+x_1 > 0$.\n",
    "\n",
    "\n",
    "* Therefore, for this model any point in the shaded region (under the line) is classified as $y=0$ and any point on or above the line is classified as $y=1$. This line is known as the \"decision boundary\".\n",
    "\n",
    "As we've seen in the videos, by using higher order polynomial terms (eg: $f(x) = g( x_0^2 + x_1 -1)$, we can come up with more complex non-linear boundaries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
